State gradients for analyzing memory in LSTM language models
Similar resources
Character-Word LSTM Language Models
We present a Character-Word Long Short-Term Memory Language Model which both reduces the perplexity with respect to a baseline word-level language model and reduces the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By...
Regularizing and Optimizing LSTM Language Models
In this paper, we consider the specific problem of word-level language modeling and investigate strategies for regularizing and optimizing LSTM-based models. We propose the weight-dropped LSTM, which uses DropConnect on hidden-to-hidden weights, as a form of recurrent regularization. Further, we introduce NT-AvSGD, a non-monotonically triggered (NT) variant of the averaged stochastic gradient met...
Using Sentence-Level LSTM Language Models for Script Inference
There is a small but growing body of research on statistical scripts, models of event sequences that allow probabilistic inference of implicit events from documents. These systems operate on structured verb-argument events produced by an NLP pipeline. We compare these systems with recent Recurrent Neural Net models that directly operate on raw tokens to predict sentences, finding the latter to ...
State Space LSTM Models with Particle MCMC Inference
Long Short-Term Memory (LSTM) is one of the most powerful sequence models. Despite its strong performance, however, it lacks the interpretability of state space models. In this paper, we present a way to combine the best of both worlds by introducing State Space LSTM (SSL) models that generalize the earlier work of Zaheer et al. (2017) on combining topic models with LSTM. However, unlike ...
Enhanced LSTM for Natural Language Inference
Reasoning and inference are central to human and artificial intelligence. Modeling inference in human language is notoriously challenging but is fundamental to natural language understanding and many applications. With the availability of large annotated data, neural network models have recently advanced the field significantly. In this paper, we present a new state-of-the-art result, achieving...
Journal
Journal title: Computer Speech & Language
Year: 2020
ISSN: 0885-2308
DOI: 10.1016/j.csl.2019.101034